Developing Non-European Translation Pairs in a Medium-Vocabulary Medical Speech Translation System

Authors

  • Pierrette Bouillon
  • Sonia Halimi
  • Yukie Nakao
  • Kyoko Kanzaki
  • Hitoshi Isahara
  • Nikos Tsourakis
  • Marianne Starlander
  • Beth Ann Hockey
  • Manny Rayner
Abstract

We describe recent work on MedSLT, a medium-vocabulary interlingua-based medical speech translation system, focussing on issues that arise when handling languages of which the grammar engineer has little or no knowledge. We describe how we can systematically create and maintain multiple forms of grammars, lexica and interlingual representations, with some versions being used by language informants, and some by grammar engineers. In particular, we describe the advantages of structuring the interlingua definition as a simple semantic grammar, which includes a human-readable surface form. We show how this allows us to rationalise the process of evaluating translations between languages lacking common speakers. The grammar-based interlingua definition can also be used in other ways. We describe two applications: a simple generic tool for debugging to-interlingua translation rules, and a method for improving speech understanding performance by rescoring N-best speech hypothesis lists. Examples presented focus on the concrete case of translation between Japanese and Arabic in both directions.
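
The N-best rescoring idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the MedSLT implementation: the toy grammar, the token vocabulary, and the names `is_well_formed` and `rescore_nbest` are all hypothetical, and the sketch assumes only that each hypothesis can be mapped to an interlingua surface form which a simple semantic grammar either accepts or rejects. The first hypothesis whose interlingua form is well-formed is preferred over the recogniser's top choice.

```python
from typing import Callable, List, Optional

# Hypothetical toy semantic grammar for interlingua surface forms:
#   UTTERANCE_TYPE SYMPTOM [BODY_PART] [TIME]
# The vocabulary below is invented for illustration only.
UTTERANCE_TYPES = {"YN-QUESTION", "WH-QUESTION"}
SYMPTOMS = {"pain", "nausea"}
BODY_PARTS = {"head", "stomach"}
TIMES = {"morning", "night"}


def is_well_formed(surface: str) -> bool:
    """Return True if the surface form is accepted by the toy grammar."""
    tokens = surface.split()
    if len(tokens) < 2:
        return False
    if tokens[0] not in UTTERANCE_TYPES or tokens[1] not in SYMPTOMS:
        return False
    rest = tokens[2:]
    if rest and rest[0] in BODY_PARTS:   # optional body part
        rest = rest[1:]
    if rest and rest[0] in TIMES:        # optional time expression
        rest = rest[1:]
    return not rest                      # no tokens may be left over


def rescore_nbest(
    nbest: List[str],
    to_interlingua: Callable[[str], Optional[str]],
) -> Optional[str]:
    """Pick the highest-ranked hypothesis with a well-formed interlingua form.

    Falls back to the recogniser's top hypothesis if none is well-formed,
    and returns None for an empty list.
    """
    for hypothesis in nbest:
        surface = to_interlingua(hypothesis)
        if surface is not None and is_well_formed(surface):
            return hypothesis
    return nbest[0] if nbest else None


# Usage with a stand-in for the to-interlingua translation step.
TOY_RULES = {"do you have pain in the head": "YN-QUESTION pain head"}
nbest = ["do you have pane in the head", "do you have pain in the head"]
best = rescore_nbest(nbest, TOY_RULES.get)
```

Here the second hypothesis is selected because only it maps to a form the grammar accepts. A real system would presumably combine this well-formedness signal with acoustic and language-model scores rather than use it as a hard filter.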

Similar resources

Evaluating Task Performance For A Unidirectional Controlled Language Medical Speech Translation System

We present a task-level evaluation of the French to English version of MedSLT, a medium-vocabulary unidirectional controlled language medical speech translation system designed for doctor-patient diagnosis interviews. Our main goal was to establish task performance levels of novice users and compare them to expert users. Tests were carried out on eight medical students with no previous exposure...

Many-to-Many Multilingual Medical Speech Translation on a PDA

Particularly considering the requirement of high reliability, we argue that the most appropriate architecture for a medical speech translator that can be realised using today’s technology combines unidirectional (doctor to patient) translation, medium-vocabulary controlled language coverage, interlingua-based translation, an embedded help component, and deployability on a hand-held hardware pla...

Source-Error Aware Phrase-Based Decoding for Robust Conversational Spoken Language Translation

Spoken language translation (SLT) systems typically follow a pipeline architecture, in which the best automatic speech recognition (ASR) hypothesis of an input utterance is fed into a statistical machine translation (SMT) system. Conversational speech often generates unrecoverable ASR errors owing to its rich vocabulary (e.g. out-of-vocabulary (OOV) named entities). In this paper, we study the ...

Overview of Speech Translation at ATR

A speech translation system will transform a spoken dialogue from the speaker's language to the listener’s automatically and simultaneously. It will undoubtedly be used to overcome language barriers and facilitate communication among the peoples of the world. Creation of such a system will first require developing the various constituent technologies: speech recognition, machine translation, an...

PanDoRA: A Large-scale Two-way Statistical Machine Translation System for Hand-held Devices

The statistical machine translation (SMT) approach has taken a lead place in the field of Machine Translation for its better translation quality and lower cost in training compared to other approaches. However, due to the high demand of computing resources, an SMT system can not be directly run on hand-held devices. Most existing hand-held translation systems are either interlingua-based, which...

Publication date: 2008